Comparing Tweets and Tags for URLs

Authors

  • Morgan Harvey
  • Mark James Carman
  • David Elsweiler
Abstract

The free-form tags available from social bookmarking sites such as Delicious have been shown to be useful for a number of purposes and could serve as a cheap source of metadata about URLs on the web. Unfortunately, recent years have seen a reduction in the popularity of such sites, while microblogging sites such as Twitter have exploded in popularity. On these sites, users submit short messages (or “tweets”) about what they are currently reading, thinking, and doing, and often post URLs. In this work we examine the similarity between top tags drawn from Delicious and high-frequency terms from tweets to ascertain whether Twitter data could serve as a useful replacement for Delicious. We investigate how these terms compare with web page content, whether top Twitter terms converge, and whether the terms are mostly descriptive (and therefore useful) or mostly express sentiment or emotion. We find that, provided a large number of tweets referring to a chosen URL are available, the top terms drawn from these tweets are similar to Delicious tags and could therefore be used for similar purposes.
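The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the tokenization, the small stopword list, counting each term once per tweet, and the use of Jaccard overlap as the similarity measure are all assumptions made for illustration, and the function names `top_terms` and `jaccard` are hypothetical.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list, not the one used in the paper.
STOPWORDS = {"a", "an", "and", "for", "in", "is", "of", "on", "the", "this", "to"}


def top_terms(tweets, k=10):
    """Return the k most frequent terms across tweets that mention one URL.

    Each term is counted at most once per tweet, so a single repetitive
    tweet cannot dominate the ranking (an assumption of this sketch).
    """
    counts = Counter()
    for tweet in tweets:
        # Drop URLs so that their fragments do not show up as terms.
        text = re.sub(r"https?://\S+", " ", tweet.lower())
        tokens = set(re.findall(r"[a-z0-9]+", text))
        counts.update(t for t in tokens if t not in STOPWORDS)
    return [term for term, _ in counts.most_common(k)]


def jaccard(terms_a, terms_b):
    """Jaccard overlap between two term collections, in the range [0, 1]."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical tweets and tags for a single URL.
    tweets = [
        "Great python tutorial http://example.com/x",
        "python tutorial for beginners http://example.com/x",
        "love this python guide",
    ]
    delicious_tags = ["python", "programming", "tutorial"]
    print(jaccard(top_terms(tweets, k=2), delicious_tags))
```

A rank-aware measure could replace Jaccard overlap if the order of the top terms matters; the set-based version is used here only because it is the simplest to read.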

Similar resources

PRIS at 2012 Microblog Track

Most tags are keyword-rich and indicate the topic of tweets directly, but they contain no spaces between words, so word segmentation was used to separate the tags with spaces. This time we used the former max matching algorithm. The problem is that no dictionary is appropriate: a common-words dictionary is incomplete, and the Oxford Dictionary does not distinguish plurals. Then we made...

A Distributed System for Detecting Phishing and Mail Alert based Malicious Tweet URLs Blocker in a Twitter Stream

Twitter is a hugely popular social network where people exchange messages of up to 140 characters called tweets. Because of the short content size and the use of URLs, phishing is harder to detect on Twitter than in email. The ease of exchanging information with a large audience makes Twitter a popular medium for spreading external content such as articles, videos, and photographs by embedding URLs in tweets...

WarningBird: Detecting Suspicious URLs in Twitter Stream

Twitter can suffer from malicious tweets containing suspicious URLs for spam, phishing, and malware distribution. Previous Twitter spam detection schemes have used account features such as the ratio of tweets containing URLs and the account creation date, or relation features in the Twitter graph. Malicious users, however, can easily fabricate account features. Moreover, extracting relation fea...

Mail_Alert: Online Suspicious URL Detection of Tweets from Twitter Public Timeline

Twitter, a famous social networking site that thousands of users use to tweet to the world, is prone to spam, phishing, and malware distribution. Tweets are the atomic building blocks of Twitter: 140-character status updates with additional associated metadata. People tweet for a variety of reasons about a multitude of topics. Traditional spam detection schemes for Twitter are ineffective ag...

Tweet-Recommender: Finding Relevant Tweets for News Articles

Twitter has become a prime source for disseminating news and opinions. However, the length of tweets prohibits detailed descriptions; instead, tweets sometimes contain URLs that link to detailed news articles. In this paper, we devise generic techniques for recommending tweets for any given news article. To evaluate and compare the different techniques, we collected tens of thousands of tweets ...

Publication date: 2012